Predicting Co-Occurrence Restrictions By Using Semantic Classifications In The Lexicon

Authors

  • Elena V. Paducheva
  • Ekaterina V. Rakhilina
Abstract

In this paper we investigate general principles of constructing semantic classifications that yield useful predictions concerning combinatory options of words. Several semantic classes of Russian words are discussed, implemented in an expert system named "Lexicographer", the system being conceived as an aid both in the area of natural language processing and in traditional lexicography. "Lexicographer" is supposed to provide its users with all kinds of information concerning some 15,000 most common Russian words. Along with conventional morphological, syntactic and semantic information usually stored in dictionaries, the system should contain information about referential characteristics of words and about restraints in combinability with other words in syntactic constructions of different types. In its final version "Lexicographer" should provide the users with all sorts of bibliographical information (concerning both individual words and semantic classes of words) and with concordances made on the basis of a sufficiently representative corpus of Russian texts. The semantic features proposed regulate co-occurrence of verbs with their non-obligatory dependents, such as modifiers of place or time, Instrumental and Benefactive objects and the like. One of the basic components of the system is its lexicon; the lexicon contains information not only about individual lexemes, but also about semantic and syntactic classes of lexemes. Thus, for nominal lexemes such features are given as: "NATURAL CLASS", "ARTEFACT", "MASS TERM", "SET", "BODY PART" and the like.
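
The idea of storing information at the level of semantic classes, and letting it predict which optional dependents a verb admits, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the class names `ACTION` and `STATE`, the dependent labels, the two transliterated verbs and their class assignments, and the helper `predict` are hypothetical and are not taken from "Lexicographer" itself.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SemanticClass:
    """A class of lexemes sharing co-occurrence behaviour (hypothetical)."""
    name: str
    # Optional dependents that members of the class are assumed to admit.
    allowed_dependents: frozenset = frozenset()


@dataclass
class Lexeme:
    """A lexicon entry that inherits predictions from its semantic class."""
    lemma: str
    sem_class: SemanticClass
    # Word-level exceptions that override the class-level prediction.
    blocked_dependents: set = field(default_factory=set)

    def accepts(self, dependent: str) -> bool:
        if dependent in self.blocked_dependents:
            return False
        return dependent in self.sem_class.allowed_dependents


# Illustrative classes only; the real system's inventory is not given here.
ACTION = SemanticClass(
    "ACTION", frozenset({"PLACE", "TIME", "INSTRUMENT", "BENEFACTIVE"})
)
STATE = SemanticClass("STATE", frozenset({"TIME"}))

# Assumed class assignments for two Russian verbs (transliterated).
LEXICON = {
    "rezat'": Lexeme("rezat'", ACTION),  # 'to cut'
    "znat'": Lexeme("znat'", STATE),     # 'to know'
}


def predict(lemma: str, dependent: str) -> bool:
    """Return True if the lexicon predicts that the verb admits the dependent."""
    return LEXICON[lemma].accepts(dependent)
```

Under these assumed assignments, `predict("rezat'", "INSTRUMENT")` returns `True` and `predict("znat'", "INSTRUMENT")` returns `False`: the restriction is stated once on the class rather than repeated for every word, and `blocked_dependents` leaves room for individual exceptions.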

Similar resources

Using Co-Composition For Acquiring Syntactic And Semantic Subcategorisation

Natural language parsing requires extensive lexicons containing subcategorisation information for specific sublanguages. This paper describes an unsupervised method for acquiring both syntactic and semantic subcategorisation restrictions from corpora. Special attention will be paid to the role of co-composition in the acquisition strategy. The acquired information is used for lexicon tuning and...

A Constraint-based Case Frame Lexicon Architecture

In Turkish, (and possibly in many other languages) verbs often convey several meanings (some totally unrelated) when they are used with subjects, objects, oblique objects, adverbial adjuncts, with certain lexical, morphological, and semantic features, and co-occurrence restrictions. In addition to the usual sense variations due to selectional restrictions on verbal arguments, in most cases, the...

Semantic Representation and Priming in a Self-organizing Lexicon

This paper presents a model of the mental lexicon and its formation, based on the self-organizing neural network. When exposed to raw text, the model clusters words according to their semantic relatedness to form a semantic network [7]. Simulations using artificial data are described that show how co-occurrence information can be used to create a low-dimensional representation of lexical semantic...

Semantic Features And Selection Restrictions

One of the essential aspects is described of an expert system (called LEXICOGRAPHER), designed to supply the user with diverse information about Russian words, including bibliographic information concerning individual lexical entries. The lexical database of the system contains semantic information that cannot be elicited from the existing dictionaries. The priority is given to semantic featur...

Word Co-occurrence Counts Prediction for Bilingual Terminology Extraction from Comparable Corpora

Methods dealing with bilingual lexicon extraction from comparable corpora are often based on word co-occurrence observation and are by essence more effective when using large corpora. In most cases, specialized comparable corpora are of small size, and this particularity has a direct impact on bilingual terminology extraction results. In order to overcome insufficient data coverage and to make ...

Publication date: 1990